Abstract: In this paper we consider a random variable Y contaminated by
an independent additive noise Z. We assume that Z has a known distribution.
Our purpose is to test the distribution of the unobserved random variable
Y. We propose a data-driven statistic based on an expansion of the density
of Y + Z, which is valid in both the discrete and the continuous case.
The test is illustrated in both cases.

1. Introduction

Consider the convolution 

X = Y + Z, (1.1) 



where Y and Z are independent random variables with densities f and g with
respect to a reference measure $\mu$. Depending on the nature of the variables,
$\mu$ can be the Lebesgue measure on $\mathbb{R}$ (or on an interval), or the
counting measure $\sum_n \delta_n$, where $\delta_n(x) = 1$ if $x = n$ and 0
otherwise. The error term Z is assumed to have a known density g. Observing X
instead of Y and Z leads to a deconvolution problem, which consists in
separating the two components of the variable. Our purpose is to test the
distribution of Y.

Several authors have considered the nonparametric deconvolution problem
of estimating f or its associated distribution function from i.i.d. observations;
that is, recovering the distribution of Y from the contaminated measurements
X. Nonparametric kernel-type estimation of f based on contaminated data was
studied by (2), (6) and (3), among others. This problem is also related to the
mixture problem, since (1.1) may be seen as a location mixture
$\int f(x - m)\, \Pi(dm)$, with m the location parameter and $\Pi$ the known
mixing distribution (see for instance (4) and (11)).

Recently, (7) proposed a new procedure for testing the density of Y. Their
method is based on the Fourier transforms of the variables Y, Z and X (the
latter via its nonparametric estimator). In this paper we also use a
nonparametric representation of the density h of X to test the distribution of
Y, but our approach is based on a polynomial expansion of the density h under
the null hypothesis. Comparing this density with the empirical one allows us to
build a test statistic. A data-driven approach then selects the number of
components of the statistic automatically. In fact, we consider the same
problem as (7), but instead of treating it through an $L^2$ distance between a
nonparametric deconvolution estimator and a smoothed version of the density f,
we restrict our attention to a known reference measure $\mu$ and its associated
$L^2$ basis, in which we represent all densities. In (7), the authors need
several assumptions on Fourier transforms, and the use of kernel density
estimators requires the classical additional assumptions on the bandwidth. The
procedure proposed in this paper needs no such parameter, thanks to the
data-driven technique. Our additional assumption is that the densities are
square integrable with respect to $\mu$. For the asymptotic results we also
impose technical conditions on the eigenvalues of the variance matrix. In
practice, however, the test may be used without the data-driven procedure,
with a large enough number of components in the statistic.

More recently, such a data-driven approach has been used for the deconvolution
problem: in (9), the problem of testing the density of Y is considered. The idea
is based on the convolution formula obtained in the continuous case, that is,
when the reference measure $\mu$ is the Lebesgue measure. In that case the
author constructs a score test statistic which is combined with a model
selection rule. In our paper, the data-driven method is close to the work of
(9), and our test statistic can be related to the score statistic (see Remark
2.1). However, we do not use the classical convolution formula, which allows us
to consider discrete convolution problems such as the one illustrated in our
simulations. It also makes it possible to introduce dependence between Y and Z,
as we point out later. On the other hand, the extension to a multivariate
setting, as done in (7), would require tedious calculations here, because
orthogonal polynomials are not well developed in that framework. Finally, the
novelty of our test is that it also covers discrete convolution, whereas, to
our knowledge, the literature essentially studies the continuous case. We also
extend our study to the dependent case, testing the conditional distribution of
Y | Z.

The paper is organised as follows: in Section 2 we introduce the method based
on the polynomial expansion of the (null) density and we propose a data-driven
procedure for testing the distribution of the contaminated variable. Section 3
details some particular cases, and in Section 4 a simulation study illustrates
the behaviour of the proposed test.

2. Method

Let $X_1, \dots, X_n$ be i.i.d. random variables with density h, satisfying the
convolution formula (1.1). We assume that g is known and we want to test

\[ H_0 : f = f_0 \quad \text{against} \quad f \neq f_0, \]

where $f_0$ is a fixed density. Let $\mu$ be a probability measure on
$\mathbb{R}$ with density m and such that the distributions of both Y and X are
dominated by $\mu$. Denote by $B = \{Q_i;\ i = 0, 1, \dots\}$ an associated
dense basis of orthogonal functions with respect to $\mu$. When h belongs to
$L^2(\mu)$ (which we shall assume) we have the following expansion:

\[ h(x) = \sum_{i \ge 0} a_i Q_i(x), \qquad (2.1) \]

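where $a_i = \mathbb{E}(Q_i(X) m(X))$. Our method consists in comparing the
coefficients $a_i$ of this expansion with those obtained under the null
hypothesis. Under $H_0$, equality (2.1) may be rewritten as

\[ h(x) = \sum_{i \ge 0} \mathbb{E}_0\big( Q_i(Y+Z)\, m(Y+Z) \big)\, Q_i(x), \]

where $\mathbb{E}_0$ denotes expectation under $H_0$. Writing
$\alpha_i = \mathbb{E}_0(Q_i(Y+Z)\, m(Y+Z))$, testing $H_0$ is equivalent to
testing

\[ H_0 : a_i = \alpha_i \quad \forall i = 1, 2, \dots \]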

Under $H_0$ the coefficients $\alpha_i$ can be easy to compute with a good
choice of the reference measure and its associated polynomials (see the
illustrations below). Thus, a natural statistic can be constructed as follows:
for some integer k, write

\[ B_k = (b_1, \dots, b_k), \]

where

\[ b_j = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( Q_j(X_i)\, m(X_i) - \alpha_j \big). \]
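As a minimal sketch of how the vector $B_k$ could be computed in practice, the
Python function below evaluates $b_j = n^{-1/2} \sum_i (Q_j(X_i) m(X_i) - \alpha_j)$
for j = 1, ..., k. The function name `compute_B` and its argument layout are
illustrative assumptions, not part of the paper.

```python
import numpy as np

def compute_B(x, basis, m, alpha):
    """Return B_k = (b_1, ..., b_k) with
    b_j = n^{-1/2} * sum_i (Q_j(x_i) * m(x_i) - alpha_j).

    x     : 1-D array of observations X_1, ..., X_n
    basis : list of k callables [Q_1, ..., Q_k] (hypothetical interface)
    m     : callable, density of the reference measure
    alpha : the k null coefficients alpha_1, ..., alpha_k
    """
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    n = x.size
    # n x k matrix whose (i, j) entry is Q_j(X_i) * m(X_i)
    v = np.column_stack([q(x) * m(x) for q in basis])
    # centre each column by alpha_j, sum over the sample, scale by 1/sqrt(n)
    return (v - alpha).sum(axis=0) / np.sqrt(n)
```

The null coefficients are passed in rather than computed, since their
evaluation depends on the reference measure chosen in each particular case.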

By the Central Limit Theorem, we have the following convergence under $H_0$:

\[ U_k = \Sigma_k^{-1/2} B_k^{t} \xrightarrow{\ \mathcal{L}\ } N(0, I), \]
where $\Sigma_k$ is the $k \times k$ covariance matrix $\mathrm{Var}(B_k)$.

Under $H_0$, $T_k = \|U_k\|^2$ is asymptotically chi-squared distributed with k
degrees of freedom, and we reject the null hypothesis for large values of $T_k$.
The arbitrary choice of k is the weakness of this method, so we adopt a
data-driven method to select the number of components in the test statistic.
Following (10), we consider an increasing sequence of numbers of components
k(n) such that $\lim_{n \to \infty} k(n) = \infty$. Our selection rule is based
on the Schwarz criterion. Write

\[ S_n = \min \Big\{ \operatorname*{argmax}_{1 \le k \le k(n)} \big( T_k - k \log(n) \big) \Big\}. \qquad (2.2) \]

Then our data-driven test statistic is $T_{S_n}$.
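The selection rule (2.2) can be sketched in a few lines of Python. The function
below is only an illustration under stated assumptions: it takes the vector
$(b_1, \dots, b_{k(n)})$ and the null covariance matrix as given, and the name
`data_driven_statistic` and its signature are hypothetical.

```python
import numpy as np

def data_driven_statistic(B, Sigma, n):
    """Return (S_n, T_{S_n}) for the Schwarz-type rule (2.2).

    B     : 1-D array (b_1, ..., b_K) with K = k(n)
    Sigma : K x K covariance matrix of B under the null; its leading
            k x k block plays the role of Sigma_k
    n     : sample size
    """
    B = np.asarray(B, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    K = B.size
    T = np.empty(K)
    for k in range(1, K + 1):
        Bk = B[:k]
        # T_k = B_k Sigma_k^{-1} B_k^t, computed without forming the inverse
        T[k - 1] = Bk @ np.linalg.solve(Sigma[:k, :k], Bk)
    penalised = T - np.arange(1, K + 1) * np.log(n)
    # np.argmax returns the first maximiser, i.e. the smallest k, as in (2.2)
    S = int(np.argmax(penalised)) + 1
    return S, float(T[S - 1])
```

Under the null the rule is expected to select k = 1 with high probability, so
the returned statistic would be compared with a chi-squared quantile with one
degree of freedom.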

Proposition 2.1. Let $\lambda_{k(n)}$ be the smallest eigenvalue of
$\Sigma_{k(n)}$. Assume that $\log k(n) / \lambda_{k(n)} = o(\log n)$. Then,
under $H_0$, $T_{S_n}$ converges in distribution to a chi-squared random
variable with 1 degree of freedom.

Proof. Under $H_0$, $T_1$ converges to a chi-squared random variable with one
degree of freedom, so we have to show that $P(S_n \ge 2)$ tends to zero.

Since $(S_n = k)$ implies $(T_k - k \log(n) \ge T_1 - \log(n))$ and $T_1 \ge 0$,
we have $P(S_n = k) \le P(T_k \ge (k-1) \log(n))$. Then we have

\[ P(S_n \ge 2) = \sum_{k=2}^{k(n)} P(S_n = k) \le \sum_{k=2}^{k(n)} P\big( T_k \ge (k-1) \log(n) \big). \qquad (2.3) \]

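Using the fact that
$\frac{1}{\lambda_k} = \sup_{x \in \mathbb{R}^k} \frac{x^{t} \Sigma_k^{-1} x}{x^{t} x}$,
where $\lambda_k$ denotes the smallest eigenvalue of $\Sigma_k$, we obtain

\[ T_k = B_k \Sigma_k^{-1} B_k^{t} \le \frac{\|B_k\|^2}{\lambda_k} \qquad (2.4) \]

and

\[ P\big( T_k \ge (k-1) \log(n) \big) \le P\big( \|B_k\|^2 \ge \lambda_k (k-1) \log(n) \big). \qquad (2.5) \]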

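As $\Sigma_k$ is a Hilbert-Schmidt operator, its trace is bounded by a constant,
say M, independently of k, and we have (under $H_0$)

\begin{align*}
\mathbb{E}(\|B_k\|^2) &= \sum_{i=1}^{k} \mathrm{Var}(b_i) \\
  &= \sum_{i=1}^{k} \Big( \mathbb{E}\big( Q_i^2(X)\, m(X)^2 \big) - \alpha_i^2 \Big) \\
  &= \mathrm{Trace}(\Sigma_k) \le M.
\end{align*}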
Combining (2.5) with the Markov inequality and the above result yields

\[ P\big( T_k \ge (k-1) \log(n) \big) \le \frac{M}{\lambda_k (k-1) \log(n)}. \qquad (2.6) \]

Finally, using a classical result on the harmonic sum we get
$\sum_{k=1}^{k(n)} \frac{1}{k} = O(\log k(n))$, and we obtain

\[ P(S_n \ge 2) \le \frac{M}{\inf_{1 \le k \le k(n)} \lambda_k} \cdot \frac{O(\log k(n))}{\log n}. \]

Since the matrices $\Sigma_k$ are nested, $(\lambda_k)$ is a decreasing
sequence, so that the infimum equals $\lambda_{k(n)}$, and we get the result.

□ 
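
For completeness, the summation step at the end of the proof can be written out
as follows. This is only a worked version of the argument, using (2.3), (2.6)
and the elementary bound $\sum_{j=1}^{N} 1/j \le 1 + \log N$; the explicit
constant $1 + \log k(n)$ is our choice of bound and does not appear in the text.

```latex
\begin{align*}
P(S_n \ge 2)
  &\le \sum_{k=2}^{k(n)} \frac{M}{\lambda_k\,(k-1)\,\log n}
   \le \frac{M}{\inf_{1 \le k \le k(n)} \lambda_k \; \log n}
       \sum_{j=1}^{k(n)-1} \frac{1}{j} \\
  &\le \frac{M}{\inf_{1 \le k \le k(n)} \lambda_k}
       \cdot \frac{1 + \log k(n)}{\log n}.
\end{align*}
```

Because the smallest eigenvalues form a decreasing sequence, the infimum is
$\lambda_{k(n)}$, and the right-hand side tends to zero as soon as
$\log k(n) / \lambda_{k(n)} = o(\log n)$.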

Remark 2.1. It is clear that our hypotheses can be rewritten
$H_0 : \theta = 0$ against $H_1 : \theta \neq 0$, where
$\theta = \mathbb{E}(V(k))$ with $V(k) = (Q_i(X) m(X) - \alpha_i)_{i=1,\dots,k}$.
Then $T_k$ coincides with the score statistic if the maximum likelihood
estimator $\hat\theta$ of $\theta$ equals the empirical mean of the sample of
the $V(k)$'s, that is $\hat\theta = B_k / \sqrt{n}$, as is the case for
instance when the distribution of $V(k)$ belongs to an exponential family.



3. Some particular cases 

3.1. Continuous positive case 

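Assume that Y and Z are continuous positive random variables. We choose for
$\mu$ the exponential distribution with mean 1 and $Q_i = L_{i,1}$, its
associated Laguerre orthogonal polynomials (see (1)). In the Appendix we recall
some basic properties of Laguerre and generalized Laguerre polynomials.
Equality (2.1) may be rewritten as

\begin{align*}
h(x) &= \sum_{i \in \mathbb{N}} \mathbb{E}\big( Q_i(Y+Z) \exp(-Y) \exp(-Z) \big)\, Q_i(x) \\
     &= \sum_{i \in \mathbb{N}} \Big( \sum_{s \le i} C_{s,i}\, \mathbb{E}\big( E_{s,u}(Y)\, E_{i-s,v}(Z) \big) \Big) Q_i(x),
\end{align*}

where $E_{s,u}(x) = L_{s,u}(x) \exp(-x)$, the $C_{s,i}$ are coefficients given
in the Appendix and u, v are arbitrary positive reals satisfying u + v = 1.
Under $H_0$, using the previous decomposition, we have:

\begin{align*}
\Sigma_{ij} &= \mathbb{E}\big( Q_i(Y+Z)\, Q_j(Y+Z) \exp(-2Y) \exp(-2Z) \big) - \alpha_i \alpha_j \\
  &= \sum_{s \le i} \sum_{t \le j} C_{s,i}\, C_{t,j}\, \mathbb{E}\big( E_{(s,t),u}(Y)\, E_{(i-s,j-t),v}(Z) \big) - \alpha_i \alpha_j,
\end{align*}

where $E_{(s,t),u}(x) = L_{s,u}(x)\, L_{t,u}(x) \exp(-2x)$.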
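
As a numerical cross-check of such coefficients, one can approximate
$\alpha_i = \mathbb{E}(Q_i(Y+Z) \exp(-(Y+Z)))$ by simulation under the null. The
sketch below does this by Monte Carlo, which is not the closed-form route
described above. It assumes the null model of the first simulation model (Y
exponential with mean 1, Z chi-squared with 1 degree of freedom) and takes
$Q_i$ to be the standard Laguerre polynomial as implemented in NumPy; the
function name is hypothetical.

```python
import numpy as np
from numpy.polynomial.laguerre import lagval

def laguerre_null_coefficients(k, n_draws=200_000, seed=0):
    """Monte Carlo approximation of alpha_i = E(L_i(X) exp(-X)), i = 0..k,
    with X = Y + Z, Y ~ Exp(mean 1), Z ~ chi-squared(1) independent
    (assumed null model; standard Laguerre normalisation assumed)."""
    rng = np.random.default_rng(seed)
    x = rng.exponential(1.0, n_draws) + rng.chisquare(1, n_draws)
    w = np.exp(-x)  # density of the reference exponential(1) measure at X
    alpha = np.empty(k + 1)
    for i in range(k + 1):
        coef = np.zeros(i + 1)
        coef[i] = 1.0  # coefficient vector selecting the polynomial L_i
        alpha[i] = np.mean(lagval(x, coef) * w)
    return alpha
```

For i = 0 the value is available in closed form, since
$\alpha_0 = \mathbb{E}(e^{-Y})\, \mathbb{E}(e^{-Z}) = \tfrac{1}{2} \cdot 3^{-1/2} \approx 0.2887$,
which gives a simple sanity check on the simulation.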

3.2. Continuous bounded case 

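Assume that Y and Z are bounded, w.l.o.g. taking values in [0, 1]. We choose
for $\mu$ the uniform distribution and for $Q_i$ its associated Legendre
orthogonal polynomials described in the Appendix. Then equality (2.1) may be
rewritten as

\begin{align*}
h(x) &= \sum_{i \in \mathbb{N}} \mathbb{E}\big( Q_i(Y+Z) \big)\, Q_i(x) \\
     &= \sum_{i \in \mathbb{N}} \Big( \sum_{s+t \le i} c_{i,s,t}\, \mathbb{E}(Y^{s})\, \mathbb{E}(Z^{t}) \Big) Q_i(x),
\end{align*}

where the $c_{i,s,t}$ are coefficients obtained by expansion of $Q_i$. Under
$H_0$:

\begin{align*}
\Sigma_{ij} &= \mathbb{E}\big( Q_i(Y+Z)\, Q_j(Y+Z) \big) - \alpha_i \alpha_j \\
  &= \sum_{s+t \le i} \sum_{u+v \le j} c_{i,s,t}\, c_{j,u,v}\, \mathbb{E}(Y^{s+u})\, \mathbb{E}(Z^{t+v}) - \alpha_i \alpha_j.
\end{align*}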

3.3. Discrete case 

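Assume that Y and Z are discrete random variables taking values in
$\mathbb{N}$. We choose for $\mu$ the geometric distribution with probability
$\mu(x) = p^{x} (1-p)$ for $x = 0, 1, \dots$, $p \in (0, 1)$, and
$Q_i = M_{i,1}$, its associated Meixner orthogonal polynomials described in the
Appendix. Equality (2.1) may be rewritten as

\begin{align*}
h(x) &= \sum_{i \in \mathbb{N}} \mathbb{E}\big( Q_i(Y+Z)\, p^{Y} p^{Z} (1-p) \big)\, Q_i(x) \\
     &= \sum_{i \in \mathbb{N}} \Big( \sum_{s \le i} C_{s,i}\, \mathbb{E}\big( E_{s,u}(Y)\, E_{i-s,v}(Z) \big) \Big) Q_i(x),
\end{align*}

where $E_{s,u}(x) = M_{s,u}(x)\, p^{x} (1-p)^{1/2}$, the $C_{s,i}$ are
coefficients given in the Appendix and u, v are arbitrary positive reals
satisfying u + v = 1. Under $H_0$, using the previous decomposition, we have:

\begin{align*}
\Sigma_{ij} &= \mathbb{E}\big( Q_i(Y+Z)\, Q_j(Y+Z)\, p^{2Y} p^{2Z} (1-p)^2 \big) - \alpha_i \alpha_j \\
  &= \sum_{s \le i} \sum_{t \le j} C_{s,i}\, C_{t,j}\, \mathbb{E}\big( E_{(s,t),u}(Y)\, E_{(i-s,j-t),v}(Z) \big) - \alpha_i \alpha_j,
\end{align*}

where $E_{(s,t),u}(x) = M_{s,u}(x)\, M_{t,u}(x)\, p^{2x} (1-p)$.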
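
To make the discrete setting concrete, the following sketch builds the
probability mass function of X = Y + Z by discrete convolution and evaluates the
first coefficient $\alpha_0 = \mathbb{E}(\mu(X))$ (with $Q_0 = 1$). It assumes
the null model of the second simulation model (Y Poisson with mean 1, Z
geometric on {0, 1, ...} with mean 1); the truncation point `N`, the value
`p = 0.5` and the helper name `pmf_convolution` are illustrative assumptions.

```python
import math

def pmf_convolution(f, g):
    """pmf of Y + Z for independent Y ~ f and Z ~ g (lists indexed from 0)."""
    h = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] += fi * gj
    return h

N = 60    # truncation point of both supports (assumption)
p = 0.5   # parameter of the geometric reference measure (assumption)

poisson = [math.exp(-1.0) / math.factorial(y) for y in range(N)]  # Y ~ Poisson(1)
geometric = [0.5 ** (z + 1) for z in range(N)]                    # Z geometric, mean 1
h = pmf_convolution(poisson, geometric)

# alpha_0 = E(Q_0(X) mu(X)) with Q_0 = 1 and mu(x) = p^x (1 - p)
alpha0 = sum(hx * p ** x * (1 - p) for x, hx in enumerate(h))

# Same quantity from the probability generating functions of Y and Z:
# E(p^Y) = exp(p - 1) and E(p^Z) = 0.5 / (1 - 0.5 p)
alpha0_closed = (1 - p) * math.exp(p - 1.0) * 0.5 / (1.0 - 0.5 * p)
```

The two values agree up to truncation and rounding error, which illustrates
that the coefficients only involve expectations of functions of Y and of Z
separately.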

3.4. Dependent case

When Y and Z are not independent, problem (1.1) can be treated by conditioning
with respect to the noise Z. In that case, the hypotheses concern the
conditional distribution of Y | Z; that is, $H_0 : f_{Y|Z} = f_0(\cdot, Z)$. We
can use the same approach, replacing the coefficients $\alpha_i$ by the
quantities

\[ \mathbb{E}\big( \mathbb{E}( Q_i(Y+Z)\, m(Y+Z) \mid Z ) \big). \]

4. Illustrations

In this section we present the results of two simulation studies of our testing
procedure. We consider i.i.d. data $X_1, \dots, X_n$ generated from two
convolution models (1.1) satisfying:

First model (Mod1): Y has an exponential distribution with mean 1 and Z is
chi-squared distributed with 1 degree of freedom. Three alternatives are studied:

Alt1: instead of (1.1), X is a mixture with two components:
50% exponential with mean 2 and 50% chi-squared with 2 degrees of freedom.

Alt2: convolution (1.1) with both Y and Z exponentially distributed with mean 1.

Alt3: convolution (1.1) with both Y and Z chi-squared distributed with 1 degree
of freedom.

Second model (Mod2): Y has a Poisson distribution with mean 1 and Z has a
geometric distribution with mean 1. Three alternatives are proposed:

Alt4: instead of (1.1), X is a mixture with two components:
50% Poisson with mean 2 and 50% geometric with mean 2.

Alt5: convolution (1.1) with both Y and Z Poisson distributed with mean 1.

Alt6: convolution (1.1) with both Y and Z geometrically distributed with mean 1.

For Models 1 and 2, the two components of the convolution have distributions
with relatively close characteristics. We are interested either in detecting a
confusion between these components (Alternatives 2-3 and Alternatives 5-6) or
in detecting a mixture of the two components instead of a convolution
(Alternative 1 and Alternative 4).

For each model and alternative, we compute the test statistic for sample sizes
n = 50, 100 and 500 and a theoretical level $\alpha$ = 5%. The empirical
level (resp. power) of the test is defined as the percentage of rejections of
the null hypothesis over 10000 replications of the test statistic under $H_0$
(resp. under the alternative). We can see that for Alternative 2 the power is
weak for small samples. For Alternative 4 the power is very low: the mixture
from Alternative 4 and the convolution from Model 2 are very close, and it is
very difficult to distinguish them.
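
The simulation design described above can be sketched as follows. The samplers
reproduce Mod1, Alt1 and Mod2 as stated in the text, assuming the geometric
distribution is supported on {0, 1, ...}; the helper `rejection_rate` and its
arguments `statistic` and `critical_value` are hypothetical placeholders for the
test statistic and its chi-squared critical value, not the authors' code.

```python
import numpy as np

def sample_model1(n, rng):
    """X = Y + Z with Y ~ Exp(mean 1) and Z ~ chi-squared(1), independent."""
    return rng.exponential(1.0, n) + rng.chisquare(1, n)

def sample_alt1(n, rng):
    """Mixture: 50% Exp(mean 2) and 50% chi-squared(2)."""
    pick = rng.random(n) < 0.5
    return np.where(pick, rng.exponential(2.0, n), rng.chisquare(2, n))

def sample_model2(n, rng):
    """X = Y + Z with Y ~ Poisson(1) and Z geometric on {0, 1, ...}, mean 1."""
    # NumPy's geometric law starts at 1, hence the shift by one
    return rng.poisson(1.0, n) + (rng.geometric(0.5, n) - 1)

def rejection_rate(sampler, statistic, critical_value, n, reps, seed=0):
    """Fraction of replications in which statistic(sample) > critical_value.

    With a sampler from the null model this estimates the empirical level,
    with a sampler from an alternative it estimates the empirical power."""
    rng = np.random.default_rng(seed)
    rejections = sum(statistic(sampler(n, rng)) > critical_value
                     for _ in range(reps))
    return rejections / reps
```

A call such as `rejection_rate(sample_alt1, statistic, critical_value, n=100, reps=10000)`
would then estimate the power against Alternative 1 for n = 100, once a
statistic implementing $T_{S_n}$ is supplied.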

Figure 1 here 
Figure 2 here 

5. Appendix: Orthogonal polynomials

We follow the notation used in (1). 

Laguerre polynomials. Laguerre polynomials $\{L_{n,1};\ n \in \mathbb{N}\}$ are
defined by their recurrence relation

\begin{align*}
L_{0,1} &= 1, \qquad L_{1,1}(x) = 1 - x, \\
(i+1)\, L_{i+1,1}(x) &= (2i + 1 - x)\, L_{i,1}(x) - i\, L_{i-1,1}(x).
\end{align*}

They are orthogonal w.r.t. the exponential distribution with density
$f(x) = \exp(-x)$ on $\mathbb{R}^+$. For a > 0, generalized Laguerre polynomials
$\{L_{n,a};\ n \in \mathbb{N}\}$ are orthogonal w.r.t. the Gamma distribution
with density $f(x, a) = \exp(-x)\, x^{a-1} / \Gamma(a)$. They satisfy

\begin{align*}
L_{0,a} &= 1, \qquad L_{1,a}(x) = a - x, \\
(i+1)\, L_{i+1,a}(x) &= (2i + a - x)\, L_{i,a}(x) - (i + a - 1)\, L_{i-1,a}(x).
\end{align*}

Their norms are given by $\|L_{i,a}\|^2 = a^{-2i}\, (i!)\, \Gamma(i+a) / \Gamma(a)$.
In the exponential case we simply get $\|L_{i,1}\|^2 = (i!)^2$. These
polynomials satisfy the following relation, which we used in our simulation
study (see (5) or (8) for an idea of the proof), for a = u + v, u > 0, v > 0:

\[ L_{n,a}(y+z) = \sum_{s=0}^{n} C_{s,n}\, u^{s} L_{s,u}(y)\, v^{n-s} L_{n-s,v}(z), \qquad (5.1) \]
where $C_{s,n} = n!/(s!(n-s)!)$.
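
Relation (5.1) can be checked numerically in the case a = u + v = 1 used in
Section 3.1. The sketch below rests on an assumption that is not stated in the
text: $L_{n,a}$ is taken to be the polynomial produced by the recurrence above,
rescaled by $a^{-n}\, n!$, which is the normalisation that reproduces the norm
formula and $\|L_{i,1}\|^2 = (i!)^2$. All function names are illustrative.

```python
from math import comb, factorial

def std_laguerre(n, a, x):
    """Degree-n polynomial from the recurrence
    (i+1) L_{i+1} = (2i + a - x) L_i - (i + a - 1) L_{i-1},
    started at L_0 = 1 and L_1 = a - x."""
    prev, cur = 1.0, a - x
    if n == 0:
        return prev
    for i in range(1, n):
        prev, cur = cur, ((2 * i + a - x) * cur - (i + a - 1) * prev) / (i + 1)
    return cur

def scaled_laguerre(n, a, x):
    """Assumed rescaling a^{-n} n! L_n (hypothetical reading of L_{n,a})."""
    return factorial(n) * std_laguerre(n, a, x) / a ** n

def addition_rhs(n, u, v, y, z):
    """Right-hand side of (5.1) with binomial coefficients C_{s,n}."""
    return sum(comb(n, s)
               * u ** s * scaled_laguerre(s, u, y)
               * v ** (n - s) * scaled_laguerre(n - s, v, z)
               for s in range(n + 1))

# Compare both sides of (5.1) for a = u + v = 1 at an arbitrary point
u, v, y, z = 0.3, 0.7, 0.4, 1.1
gaps = [abs(scaled_laguerre(n, u + v, y + z) - addition_rhs(n, u, v, y, z))
        for n in range(6)]
```

Under this normalisation the gaps are of the order of rounding error, which
gives some confidence in the decomposition used for the coefficients in the
continuous positive case.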

Legendre polynomials. For $\mu$ the uniform distribution on (0, 1), the
associated orthogonal polynomials are the (shifted) Legendre polynomials
defined by the recurrence relation:

\begin{align*}
P_0 &= 1, \qquad P_1(x) = 2x - 1, \\
(n+1)\, P_{n+1}(x) &= (2n+1)(2x-1)\, P_n(x) - n\, P_{n-1}(x),
\end{align*}

and satisfying $\|P_n\|^2 = \int_0^1 P_n(x)^2\, dx = (2n+1)^{-1}$.

imsart-ejs ver. 2008/08/29 file: ejs_2009_364.tex date: January 27, 2009
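
The orthogonality and the norm of the shifted Legendre polynomials can be
verified exactly with rational arithmetic. The sketch below takes the
recurrence in its shifted form, with 2x - 1 as the linear factor, and
represents each polynomial by its list of coefficients; the helper names are
illustrative.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def shifted_legendre(n_max):
    """Coefficient lists of P_0, ..., P_{n_max} on [0, 1], built from
    (n+1) P_{n+1}(x) = (2n+1)(2x-1) P_n(x) - n P_{n-1}(x)."""
    linear = [Fraction(-1), Fraction(2)]          # the polynomial 2x - 1
    polys = [[Fraction(1)], linear]
    for n in range(1, n_max):
        nxt = [(2 * n + 1) * c for c in poly_mul(linear, polys[n])]
        for i, c in enumerate(polys[n - 1]):
            nxt[i] -= n * c
        polys.append([c / (n + 1) for c in nxt])
    return polys[: n_max + 1]

def inner(a, b):
    """Exact integral of a(x) b(x) over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(poly_mul(a, b)))

P = shifted_legendre(5)
```

With these definitions, `inner(P[i], P[j])` is zero for distinct indices and
`inner(P[n], P[n])` equals `Fraction(1, 2 * n + 1)`.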

Meixner polynomials. For $\mu$ the Pascal distribution,
$\mu(x) = (1-p)\, p^{x}$, $p \in (0, 1)$, $x \in \mathbb{N}$, the associated
orthogonal polynomials satisfy the following recurrence formula:

\begin{align*}
M_0 &= 1, \qquad M_1(x) = 1 - p - (1-p)^2\, x / p, \\
\frac{p}{1-p}\, M_{n+1}(x) &= \big( (p-1)x + (1+p)n + p \big) M_n(x) - (1-p)\, n^2\, M_{n-1}(x),
\end{align*}

with $\|M_n\|^2 = p^{-n} (1-p)^{2n} (n!)^2$. They are a particular case of the
Meixner polynomials $M_{n,b,c}$ with b = 1 and c = p. These polynomials also
satisfy a relation similar to (5.1), which we used for the illustrations:

\[ M_{n,b,c}(y+z) = \sum_{s=0}^{n} C_{s,n}\, M_{s,u,c}(y)\, M_{n-s,v,c}(z), \qquad (5.2) \]
where u + v = b and $C_{s,n} = n!/(s!(n-s)!)$.
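
The recurrence and the norm formula for these polynomials can be checked
numerically against the Pascal weights. In the sketch below, $M_1$ is obtained
from the n = 0 step of the recurrence rather than entered separately; the
truncation of the infinite sum at `x_max` and the value p = 0.4 are
illustrative assumptions, and the function names are hypothetical.

```python
from math import factorial

def meixner(n_max, p, x):
    """Values M_0(x), ..., M_{n_max}(x) from the recurrence
    p/(1-p) M_{n+1} = ((p-1)x + (1+p)n + p) M_n - (1-p) n^2 M_{n-1},
    started at M_0 = 1 (the n = 0 step yields M_1)."""
    vals = [1.0]
    prev = 0.0
    for n in range(n_max):
        cur = vals[-1]
        rhs = ((p - 1) * x + (1 + p) * n + p) * cur - (1 - p) * n * n * prev
        vals.append(rhs * (1 - p) / p)
        prev = cur
    return vals

def inner(i, j, p, x_max=400):
    """Truncated sum of M_i(x) M_j(x) (1-p) p^x over x = 0, ..., x_max."""
    total = 0.0
    for x in range(x_max + 1):
        m = meixner(max(i, j), p, x)
        total += m[i] * m[j] * (1 - p) * p ** x
    return total

def norm_formula(n, p):
    """Squared norm p^{-n} (1-p)^{2n} (n!)^2 stated in the text."""
    return p ** (-n) * (1 - p) ** (2 * n) * factorial(n) ** 2

p = 0.4
off_diagonal = inner(1, 2, p)
diagonal_gap = abs(inner(2, 2, p) - norm_formula(2, p))
```

Both `off_diagonal` and `diagonal_gap` are numerically negligible, in line with
the orthogonality and the norm given above.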



Fig 1. Empirical level for Model 1 and empirical powers for Alternatives 1-3,
with $\alpha$ = 5%, n = 50, 100, 500. (Surviving panel titles: Mod1 (level),
Alt3 (power).)

Fig 2. Empirical level for Model 2 and empirical powers for Alternatives 4-6,
with $\alpha$ = 5%, n = 50, 100, 500. (Horizontal axis: sample size.)